Geospatial Analytics for Social Good

Background

Water is an essential resource: it supports a healthy environment and a sustainable economy, helps reduce poverty, and contributes to peace and security. Access to clean water is critical, yet 40% of the population already lacks access to safe drinking water. Water scarcity also disrupts ecosystems and threatens food security, because agriculture depends on the world’s accessible freshwater.

The scarcity of fresh water is so severe that 80% of illnesses in developing countries are attributed to a lack of sanitation and clean water. Providing clean water to rural communities remains a major challenge in many developing countries, especially in Africa.

To address this issue, a project has been initiated to collect water point data from rural areas at the water point or small water scheme level and to share the data via the WPdx Data Repository, a cloud-based data library.

Objective

The objective is to address complex problems faced by society using geospatial analysis. By applying appropriate global and local measures of spatial association, the spatial patterns of non-functional water points are revealed. In this exercise, Nigeria is used as the study country.

The Data

Aspatial Data

The WPdx+ data set from the WPdx Global Data Repositories will be used.

Geospatial Data

The Nigeria Level-2 Administrative Boundary (also known as Local Government Area, or LGA) polygon features GIS data, downloaded from geoBoundaries, will be used.

The Task

Data Wrangling

The following tasks will be performed in the geospatial analysis:

  • Import the shapefile into R using an appropriate sf method and save it as a simple feature data frame.

  • Derive the proportion of functional and non-functional water points at LGA level using appropriate tidyr and dplyr methods.

  • Combine the geospatial and aspatial data frames into a simple feature data frame.

  • Perform outlier/cluster analysis using appropriate local measures of spatial association.

  • Perform hotspot area analysis using appropriate local measures of spatial association.

Thematic Mapping

  • Plot maps showing the spatial distribution of functional and non-functional water point rates at LGA level using appropriate thematic mapping techniques provided by the tmap package.

Analytical Mapping

  • Plot hotspot area and outlier/cluster maps of functional and non-functional water point rates at LGA level using appropriate thematic mapping techniques provided by the tmap package.

Getting Started

Setting the Analytical Tools

The code chunk below uses p_load() of the pacman package to install any missing R packages and load them into the R environment.

pacman::p_load(sf, spdep, tmap, tidyverse, funModeling)

Import shapefile into R environment

Importing water point geospatial data

Using st_read() of the sf package, the water point shapefile (geo_export) is imported into the R environment.

wp <- st_read(dsn = "Geodata",
              layer = "geo_export",
              crs = 4326) %>%
  filter(clean_coun == "Nigeria")

Things to learn from the code chunk above:

**st_read()** of sf package is used to import the geo_export shapefile into the R environment and save the imported geospatial data as a simple feature data table.

**filter()** of dplyr package is used to extract the water point records of Nigeria.

Next, `write_rds()` of readr package is used to save the extracted sf data table (i.e. wp) into an output file in rds data format. The output file is called *wp_nga.rds* and it is saved in the *Geodata* sub-folder.

write_rds(wp, "Geodata/wp_nga.rds")

Importing Nigeria LGA boundary data

Now, we are going to import the LGA boundary data into R environment by using the code chunk below.

nga <- st_read(dsn = "Geodata",
               layer= "geoBoundaries-NGA-ADM2",
               crs = 4326)

Things to learn from the code chunk above:

- `st_read()` of **sf** package is used to import *geoBoundaries-NGA-ADM2* shapefile into R environment and save the imported geospatial data into simple feature data table.

Data Wrangling

Recoding the NA values into string

In the code chunk below, `replace_na()` is used to recode all the *NA* values in *status_cle* field into *Unknown*.

# Read the saved water point data, then recode missing status values
wp_nga <- read_rds("Geodata/wp_nga.rds") %>%
  mutate(status_cle = replace_na(status_cle, "Unknown"))
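As a minimal sketch of what `replace_na()` does, the chunk below applies it to a small made-up tibble. The `toy` data frame and its values are illustrative assumptions, not WPdx records.

```r
library(dplyr)
library(tidyr)

# Hypothetical toy data standing in for the status_cle field
toy <- tibble(
  id = 1:4,
  status_cle = c("Functional", NA, "Non-Functional", NA)
)

# Recode missing status values into the string "Unknown"
toy_clean <- toy %>%
  mutate(status_cle = replace_na(status_cle, "Unknown"))

print(toy_clean)
```

Recoding the missing values first means they appear as their own class in later frequency tables and filters instead of being silently dropped.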

Exploratory Data Analysis

In the code chunk below, freq() of funModeling package is used to display the distribution of status_cle field in wp_nga.

freq(data=wp_nga, 
     input = 'status_cle')
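If funModeling is not available, a comparable frequency table can be sketched with `count()` of dplyr. This is an alternative, not the method used in this exercise; the `toy` data and the column names `frequency` and `percentage` are assumptions made for illustration.

```r
library(dplyr)

# Hypothetical toy data standing in for wp_nga
toy <- tibble(
  status_cle = c("Functional", "Functional", "Non-Functional", "Unknown")
)

# Count records per class and express each count as a percentage
status_freq <- toy %>%
  count(status_cle, name = "frequency", sort = TRUE) %>%
  mutate(percentage = round(100 * frequency / sum(frequency), 2))

print(status_freq)
```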

Extracting Water Points

Water point records are extracted by using the classes in the status_cle field.

Extracting functional water point

filter() of dplyr is used to select functional water points.

wpt_functional <- wp_nga %>%
  filter(status_cle %in%
           c("Functional", 
             "Functional but not in use",
             "Functional but needs repair"))
freq(data=wpt_functional, 
     input = 'status_cle')

Extracting non-functional water points

filter() of dplyr is used to select non-functional water points.

wpt_nonfunctional <- wp_nga %>%
  filter(status_cle %in%
           c("Abandoned/Decommissioned", 
             "Abandoned",
             "Non-Functional",
             "Non functional due to dry season",
             "Non-Functional due to dry season"))
freq(data=wpt_nonfunctional, 
     input = 'status_cle')

Extracting water point with unknown class

filter() of dplyr is used to select water points with unknown status.

wpt_unknown <- wp_nga %>%
  filter(status_cle == "Unknown")
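The three filters above could also be expressed as a single grouping column. The chunk below is a sketch of that alternative using `case_when()` of dplyr on made-up records; the names `functional_classes`, `nonfunctional_classes`, `toy` and `status_group` are hypothetical and do not appear elsewhere in this exercise.

```r
library(dplyr)

# Class labels taken from the filter() calls above
functional_classes <- c("Functional",
                        "Functional but not in use",
                        "Functional but needs repair")
nonfunctional_classes <- c("Abandoned/Decommissioned",
                           "Abandoned",
                           "Non-Functional",
                           "Non functional due to dry season",
                           "Non-Functional due to dry season")

# Hypothetical toy records standing in for wp_nga
toy <- tibble(
  status_cle = c("Functional", "Abandoned", "Unknown",
                 "Functional but needs repair", "Non-Functional")
)

# Assign each record to one of three groups in a single pass
toy_grouped <- toy %>%
  mutate(status_group = case_when(
    status_cle %in% functional_classes    ~ "functional",
    status_cle %in% nonfunctional_classes ~ "non-functional",
    TRUE                                  ~ "unknown"
  ))

print(toy_grouped)
```

Keeping separate data frames, as done above, makes the later point-in-polygon counts straightforward; a grouping column is mainly useful for quick summaries.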

Performing Point-in-Polygon Count

nga_wp <- nga %>% 
  mutate(`total wpt` = lengths(
    st_intersects(nga, wp_nga))) %>%
  mutate(`wpt functional` = lengths(
    st_intersects(nga, wpt_functional))) %>%
  mutate(`wpt non-functional` = lengths(
    st_intersects(nga, wpt_nonfunctional))) %>%
  mutate(`wpt unknown` = lengths(
    st_intersects(nga, wpt_unknown)))
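The counting idiom used above, `lengths(st_intersects(polygons, points))`, can be illustrated on a tiny synthetic example. The `square()` helper, the two polygons and the three points are all hypothetical, and no CRS is set, so sf treats the coordinates as planar.

```r
library(sf)

# Hypothetical helper: a unit square whose lower-left corner is (xmin, 0)
square <- function(xmin) {
  st_polygon(list(rbind(
    c(xmin, 0), c(xmin + 1, 0), c(xmin + 1, 1), c(xmin, 1), c(xmin, 0)
  )))
}

# Two toy polygons standing in for LGAs
polys <- st_sf(
  name = c("A", "B"),
  geometry = st_sfc(square(0), square(2))
)

# Three toy points standing in for water points
pts <- st_sf(
  id = 1:3,
  geometry = st_sfc(st_point(c(0.2, 0.2)),
                    st_point(c(0.8, 0.5)),
                    st_point(c(2.5, 0.5)))
)

# st_intersects() returns, for each polygon, the indices of the points it
# contains or touches; lengths() turns that list into a count per polygon
polys$n_points <- lengths(st_intersects(polys, pts))

print(polys)
```

Because `st_intersects()` returns one list element per polygon, the resulting counts line up row by row with the polygon data frame, which is why they can be assigned directly inside `mutate()`.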

Saving the Analytical Data Table

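Things to learn from the code chunk below:

  • mutate() of dplyr package is used to derive two fields, namely pct_functional and pct_non-functional.

  • write_rds() of readr package is used to save the resulting data table as *nga_wp.rds*. To keep the file size small, select() of dplyr can be used beforehand to retain only the fields needed; the chunk below keeps all fields.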

nga_wp <- nga_wp %>%
  mutate(pct_functional = `wpt functional`/`total wpt`) %>%
  mutate(`pct_non-functional` = `wpt non-functional`/`total wpt`)
write_rds(nga_wp, "Geodata/nga_wp.rds")
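An LGA with no water points gives 0/0 in the division above, which R evaluates to NaN. The chunk below is a sketch of one way to make that case explicit by returning NA instead; it uses a made-up `toy` table and is an optional variation, not part of the original workflow.

```r
library(dplyr)

# Hypothetical toy counts; LGA "C" has no water points at all
toy <- tibble(
  lga = c("A", "B", "C"),
  `total wpt` = c(10, 4, 0),
  `wpt functional` = c(6, 1, 0),
  `wpt non-functional` = c(3, 3, 0)
)

# Compute proportions only where the denominator is positive
toy_pct <- toy %>%
  mutate(
    pct_functional = if_else(`total wpt` > 0,
                             `wpt functional` / `total wpt`,
                             NA_real_),
    `pct_non-functional` = if_else(`total wpt` > 0,
                                   `wpt non-functional` / `total wpt`,
                                   NA_real_)
  )

print(toy_pct)
```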

Visualising the spatial distribution of water points

nga_wp <- read_rds("Geodata/nga_wp.rds")

#| fig-width: 14
#| fig-height: 12
tmap_mode("view")  # switch tmap to interactive viewing
tm_shape(nga_wp)+
  tm_fill(c("total wpt","wpt functional","wpt non-functional","wpt unknown"), 
          style = "quantile", 
          palette = "Blues") +
  tm_layout(main.title = "Distribution of Water Points",
            main.title.position = "center",
            main.title.size = 1,
            legend.height = 0.25, 
            legend.width = 0.35,
            legend.position = c("right", "bottom"),
            frame = TRUE) +
  tm_borders(alpha = 0.5)
Note that tmap reports that legend.position is only used in plot mode. To set the legend position in view mode, use view.legend.position in tm_view().